91 research outputs found
CNN-based fast source device identification
Source identification is an important topic in image forensics, since it
allows investigators to trace an image back to its origin. This information is
valuable both for claiming intellectual property and for revealing the authors
of illicit materials. In this paper we address the problem of device
identification based on sensor noise and propose a fast and accurate solution
using convolutional neural networks (CNNs). Specifically, we propose a
2-channel-based CNN that learns a way of comparing camera fingerprint and image
noise at patch level. The proposed solution is much faster than the
conventional approach while also achieving higher accuracy. This makes the
approach particularly suitable in scenarios where large databases of images are
analyzed, such as on social networks. In this vein, since images uploaded on
social media usually undergo at least two compression stages, we include
investigations on double JPEG compressed images, always reporting higher
accuracy than standard approaches.
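The conventional sensor-noise pipeline that this kind of CNN-based method is benchmarked against boils down to correlating a camera's PRNU fingerprint with the noise residual of the image under test. A minimal sketch of that correlation baseline, using synthetic stand-ins for the fingerprint and residuals (real PRNU extraction is out of scope here):

```python
import numpy as np

def ncc(fingerprint: np.ndarray, residual: np.ndarray) -> float:
    """Normalized cross-correlation between a camera fingerprint (PRNU
    estimate) and the noise residual of a query image patch."""
    f = fingerprint - fingerprint.mean()
    r = residual - residual.mean()
    return float((f * r).sum() / (np.linalg.norm(f) * np.linalg.norm(r)))

rng = np.random.default_rng(0)
fingerprint = rng.normal(size=(64, 64))                   # synthetic stand-in for a PRNU fingerprint
matching = 0.3 * fingerprint + rng.normal(size=(64, 64))  # residual from the same "camera"
other = rng.normal(size=(64, 64))                         # residual from a different "camera"

print(ncc(fingerprint, matching) > ncc(fingerprint, other))  # → True
```

Computing this correlation at every patch, for every candidate fingerprint, is what makes the conventional approach slow on large databases; the 2-channel CNN replaces the fixed correlation with a learned comparison.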
Source localization and denoising: a perspective from the TDOA space
In this manuscript, we formulate the problem of denoising Time Differences of
Arrival (TDOAs) in the TDOA space, i.e. the Euclidean space spanned by TDOA
measurements. The method consists of pre-processing the TDOAs with the purpose
of reducing the measurement noise. The complete set of TDOAs (i.e., TDOAs
computed at all microphone pairs) is known to form a redundant set, which lies
on a linear subspace in the TDOA space. Noise, however, prevents TDOAs from
lying exactly on this subspace. We therefore show that TDOA denoising can be
seen as a projection operation that suppresses the component of the noise that
is orthogonal to that linear subspace. We then generalize the projection
operator also to the cases where the set of TDOAs is incomplete. We
analytically show that this operator improves the localization accuracy, and we
further confirm this via simulations. Comment: 25 pages, 9 figures
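The projection view of TDOA denoising can be sketched numerically: with M microphones, the complete TDOA vector is a linear function of the M-1 reduced TDOAs, so projecting noisy measurements onto the column space of that linear map suppresses the orthogonal noise component. A small self-contained example (microphone count and noise level are arbitrary choices):

```python
import numpy as np

M = 5                      # number of microphones (hypothetical setup)
pairs = [(i, j) for i in range(M) for j in range(i + 1, M)]

# The complete TDOA set tau_ij is a linear function of the M-1 reduced
# TDOAs theta_j = tau_0j:  tau_ij = theta_j - theta_i  (with theta_0 = 0).
C = np.zeros((len(pairs), M - 1))
for row, (i, j) in enumerate(pairs):
    if j > 0:
        C[row, j - 1] += 1.0
    if i > 0:
        C[row, i - 1] -= 1.0

# Orthogonal projector onto the subspace spanned by the columns of C.
P = C @ np.linalg.pinv(C)

rng = np.random.default_rng(1)
theta = rng.normal(size=M - 1)        # ground-truth reduced TDOAs
tau = C @ theta                       # noise-free complete TDOA vector
noisy = tau + 0.1 * rng.normal(size=len(pairs))
denoised = P @ noisy                  # removes the noise component orthogonal to the subspace

print(np.linalg.norm(denoised - tau) < np.linalg.norm(noisy - tau))  # → True
```

Since P tau = tau, the residual error after projection is exactly P applied to the noise, which can never be longer than the noise itself.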
An In-Depth Study on Open-Set Camera Model Identification
Camera model identification refers to the problem of linking a picture to the
camera model used to shoot it. As this might be an enabling factor in different
forensic applications to single out possible suspects (e.g., detecting the
author of child abuse or terrorist propaganda material), many accurate camera
model attribution methods have been developed in the literature. One of their
main drawbacks, however, is the typical closed-set assumption of the problem.
This means that an investigated photograph is always assigned to one camera
model within a set of known ones present during investigation, i.e., training
time, and the fact that the picture can come from a completely unrelated camera
model during actual testing is usually ignored. Under realistic conditions, it
is not possible to assume that every picture under analysis belongs to one of
the available camera models. To deal with this issue, in this paper, we present
the first in-depth study on the possibility of solving the camera model
identification problem in open-set scenarios. Given a photograph, we aim at
detecting whether it comes from one of the known camera models of interest or
from an unknown one. We compare different feature extraction algorithms and
classifiers specially targeting open-set recognition. We also evaluate possible
open-set training protocols that can be applied along with any open-set
classifier, observing that the simplest of those alternatives obtains the best results.
Thorough testing on independent datasets shows that it is possible to leverage
a recently proposed convolutional neural network as feature extractor paired
with a properly trained open-set classifier aiming at solving the open-set
camera model attribution problem even on small-scale image patches, improving
over available state-of-the-art solutions. Comment: Published in the IEEE Access journal
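A common building block in open-set recognition of the kind studied here is a rejection rule on top of a closed-set decision: assign the sample to the nearest known class in feature space, but output "unknown" when even the nearest class is too far away. A minimal sketch (the features, centroids, and threshold are all hypothetical):

```python
import numpy as np

def open_set_predict(feat, centroids, threshold):
    """Assign `feat` to the nearest known-class centroid, or to the
    'unknown' class (-1) if it is too far from all of them."""
    dists = np.linalg.norm(centroids - feat, axis=1)
    k = int(np.argmin(dists))
    return k if dists[k] <= threshold else -1

# Hypothetical 2-D features for three known camera models.
centroids = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])

print(open_set_predict(np.array([0.2, -0.1]), centroids, threshold=1.5))  # → 0
print(open_set_predict(np.array([10.0, 10.0]), centroids, threshold=1.5))  # → -1
```

The threshold trades off false rejections of known models against false acceptances of unknown ones, which is exactly the tension the open-set training protocols in the paper are designed to manage.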
All-for-One and One-For-All: Deep learning-based feature fusion for Synthetic Speech Detection
Recent advances in deep learning and computer vision have made the synthesis
and counterfeiting of multimedia content more accessible than ever, leading to
possible threats and dangers from malicious users. In the audio field, we are
witnessing the growth of speech deepfake generation techniques, which call for
the development of synthetic speech detection algorithms to counter possible
mischievous uses such as frauds or identity thefts. In this paper, we consider
three different feature sets proposed in the literature for the synthetic
speech detection task and present a model that fuses them, achieving overall
better performance than state-of-the-art solutions. The system
was tested on different scenarios and datasets to prove its robustness to
anti-forensic attacks and its generalization capabilities. Comment: Accepted at the ECML-PKDD 2023 Workshop "Deep Learning and Multimedia
Forensics. Combating fake media and misinformation"
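A simple form of the feature fusion described above is early fusion: normalize each feature set so that no single one dominates by scale, then concatenate them into one vector for the downstream classifier. A minimal sketch (the feature names and sizes are made up for illustration):

```python
import numpy as np

def fuse(features):
    """Early fusion by concatenation: L2-normalize each per-feature-set
    vector, then stack them into a single fused vector."""
    normed = [f / (np.linalg.norm(f) + 1e-12) for f in features]
    return np.concatenate(normed)

# Hypothetical feature sets extracted from one audio clip.
spectral = np.ones(4)
prosodic = np.ones(3)
learned = np.ones(5)

fused = fuse([spectral, prosodic, learned])
print(fused.shape)  # → (12,)
```

The fused vector can then feed any classifier; more elaborate fusion schemes learn the combination weights jointly with the detector.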
Anti-Aliasing Add-On for Deep Prior Seismic Data Interpolation
Data interpolation is a fundamental step in any seismic processing workflow.
Among the machine learning techniques recently proposed to solve data interpolation
as an inverse problem, the Deep Prior paradigm employs a convolutional
neural network to capture priors on the data in order to regularize the
inversion. However, this technique lacks reconstruction precision when
interpolating highly decimated data, due to the presence of aliasing. In this
work, we propose to improve Deep Prior inversion by adding a directional
Laplacian as regularization term to the problem. This regularizer drives the
optimization towards solutions that honor the slopes estimated from the
interpolated data low frequencies. We provide some numerical examples to
showcase the methodology devised in this manuscript, showing that our results
are less prone to aliasing, even in the presence of noisy and corrupted data.
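The key property exploited by a directional Laplacian regularizer is that a second derivative taken along the local slope direction annihilates events with that slope, so penalizing its energy steers the inversion toward slope-consistent solutions. A small numpy illustration on a synthetic linear event (the slope, grid, and finite-difference scheme are illustrative choices, not the paper's exact operator):

```python
import numpy as np

def directional_laplacian(u, p):
    """Second derivative along the slope direction, (d/dx + p d/dt)^2,
    applied to a 2-D section u[t, x]. Events with slope p are (nearly)
    annihilated, so ||L u||^2 favors slope-consistent reconstructions."""
    du_t, du_x = np.gradient(u)       # gradients along axis 0 (t) and axis 1 (x)
    d1 = du_x + p * du_t              # first directional derivative
    d1_t, d1_x = np.gradient(d1)
    return d1_x + p * d1_t            # second directional derivative

t = np.arange(64)[:, None]
x = np.arange(64)[None, :]
p = 0.5
event = np.sin(0.2 * (t - p * x))     # synthetic linear event with slope p

aligned = directional_laplacian(event, p)    # operator matched to the true slope
wrong = directional_laplacian(event, -p)     # operator with the wrong slope
inner = (slice(2, -2), slice(2, -2))         # ignore boundary stencils
print(np.abs(aligned[inner]).max() < 0.01 * np.abs(wrong[inner]).max())  # → True
```

When the assumed slope matches the data, the regularizer's output is orders of magnitude smaller, which is why slopes estimated from the alias-free low frequencies can guide the high-frequency reconstruction.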
Training CNNs in Presence of JPEG Compression: Multimedia Forensics vs Computer Vision
Convolutional Neural Networks (CNNs) have proved very accurate in multiple
computer vision image classification tasks that required visual inspection in
the past (e.g., object recognition, face detection, etc.). Motivated by these
astonishing results, researchers have also started using CNNs to cope with
image forensic problems (e.g., camera model identification, tampering
detection, etc.). However, in computer vision, image classification methods
typically rely on visual cues easily detectable by human eyes. Conversely,
forensic solutions rely on almost invisible traces that are often very subtle
and lie in the fine details of the image under analysis. For this reason,
training a CNN to solve a forensic task requires some special care, as common
processing operations (e.g., resampling, compression, etc.) can strongly hinder
forensic traces. In this work, we focus on the effect that JPEG has on CNN
training considering different computer vision and forensic image
classification problems. Specifically, we consider the issues that arise from
JPEG compression and misalignment of the JPEG grid. We show that it is
necessary to account for these effects when generating a training dataset in order
to properly train a forensic detector without losing generalization capability,
whereas these effects can largely be ignored for computer vision
tasks.
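One practical consequence of the grid-misalignment issue is in how training patches are cropped: if every patch starts at a multiple of 8, the network only ever sees a single JPEG grid alignment. A simple augmentation sketch (patch size and image are illustrative; randomizing the crop origin over 0-7 pixels desynchronizes the 8x8 grid):

```python
import numpy as np

def random_grid_crop(img, out, rng):
    """Crop `img` to an `out`-sized patch at a random offset, so the crop's
    origin falls at an arbitrary position (0-7) inside the 8x8 JPEG grid.
    Augmenting a training set this way keeps a forensic CNN from latching
    onto one fixed grid alignment."""
    dy = int(rng.integers(0, 8))
    dx = int(rng.integers(0, 8))
    return img[dy:dy + out, dx:dx + out]

rng = np.random.default_rng(2)
img = np.zeros((136, 136), dtype=np.uint8)   # stand-in for a JPEG-decoded image
patch = random_grid_crop(img, 128, rng)
print(patch.shape)  # → (128, 128)
```

For computer vision tasks the crop origin barely matters, but for forensic traces that live on the 8x8 block structure this choice changes what the network can learn.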
Aligned and Non-Aligned Double JPEG Detection Using Convolutional Neural Networks
Due to the wide diffusion of JPEG coding standard, the image forensic
community has devoted significant attention to the development of double JPEG
(DJPEG) compression detectors through the years. The ability to detect
whether an image has been compressed twice provides paramount information
for image authenticity assessment. Given the success recently achieved by
convolutional neural networks (CNNs) in many computer vision tasks, in this
paper we propose to use CNNs for aligned and non-aligned double JPEG
compression detection. In particular, we explore the capability of CNNs to
capture DJPEG artifacts directly from images. Results show that the proposed
CNN-based detectors achieve good performance even with small size images (i.e.,
64x64), outperforming state-of-the-art solutions, especially in the non-aligned
case. Besides, good results are also achieved in the commonly-recognized
challenging case in which the first quality factor is larger than the second
one. Comment: Submitted to the Journal of Visual Communication and Image Representation
(first submission: March 20, 2017; second submission: August 2, 2017).
On the use of Benford's law to detect GAN-generated images
The advent of Generative Adversarial Network (GAN) architectures has given
anyone the ability of generating incredibly realistic synthetic imagery. The
malicious diffusion of GAN-generated images may lead to serious social and
political consequences (e.g., fake news spreading, opinion formation, etc.). It
is therefore important to regulate the widespread distribution of synthetic
imagery by developing solutions able to detect them. In this paper, we study
the possibility of using Benford's law to discriminate GAN-generated images
from natural photographs. Benford's law describes the distribution of the most
significant digit for quantized Discrete Cosine Transform (DCT) coefficients.
Extending and generalizing this property, we show that it is possible to
extract a compact feature vector from an image. This feature vector can be fed
to an extremely simple classifier for the purpose of GAN-generated image detection.
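Benford's law predicts P(d) = log10(1 + 1/d) for the first significant digit d, and a first-digit histogram of quantized DCT coefficients can be compared against this prediction. A minimal sketch of such a feature (the coefficients below are synthetic stand-ins, not a real DCT stream):

```python
import numpy as np

def benford_probs():
    """Benford's law: P(first digit = d) = log10(1 + 1/d), for d = 1..9."""
    d = np.arange(1, 10)
    return np.log10(1.0 + 1.0 / d)

def first_digit_feature(coeffs):
    """Normalized 9-bin histogram of the most significant digit of the
    non-zero quantized coefficients; its divergence from the Benford
    prediction can serve as a compact detection feature."""
    c = np.abs(coeffs[coeffs != 0]).astype(int)
    msd = np.array([int(str(v)[0]) for v in c])   # first significant digit
    hist = np.bincount(msd, minlength=10)[1:10].astype(float)
    return hist / hist.sum()

# Hypothetical quantized coefficients (Laplacian-like, as DCT coefficients tend to be).
rng = np.random.default_rng(3)
coeffs = np.round(rng.laplace(scale=20.0, size=5000)).astype(int)

feat = first_digit_feature(coeffs)
print(feat.shape)                                 # → (9,)
print(abs(benford_probs().sum() - 1.0) < 1e-12)   # the nine probabilities sum to 1
```

Concatenating such histograms (or their divergences from the Benford prediction) across DCT frequencies and quantization steps yields the compact feature vector the abstract refers to.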
H4VDM: H.264 Video Device Matching
Methods that can determine if two given video sequences are captured by the
same device (e.g., mobile telephone or digital camera) can be used in many
forensics tasks. In this paper we refer to this as "video device matching". In
open-set video forensics scenarios, it is easier to determine whether two video
sequences were captured with the same device than to identify the specific
device. In this paper, we propose a technique for open-set video device
matching. Given two H.264 compressed video sequences, our method can determine
if they are captured by the same device, even if our method has never
encountered the device in training. We denote our proposed technique as H.264
Video Device Matching (H4VDM). H4VDM uses H.264 compression information
extracted from video sequences to make decisions. It is more robust against
artifacts that alter camera sensor fingerprints, and it can be used to analyze
relatively small fragments of the H.264 sequence. We trained and tested our
method on a publicly available video forensics dataset consisting of 35
devices, where our proposed method demonstrated good performance.